Searching Massive Data Streams Using Multipattern Regular Expressions

Authors

  • Jon Stewart
  • Joel Uckelman
Abstract

This paper describes the design and implementation of lightgrep, a multipattern regular expression search tool that efficiently searches massive data streams. lightgrep addresses several shortcomings of existing digital forensic tools by taking advantage of recent developments in automata theory. The tool directly simulates a nondeterministic finite automaton, and incorporates a number of practical optimizations related to searching with large pattern sets.
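The idea of directly simulating a nondeterministic finite automaton over a stream, for several patterns at once, can be illustrated with a minimal sketch. This is not lightgrep's implementation: the instruction names, the `compile_patterns` and `search` functions, and the restricted pattern syntax (literals, `.`, and postfix `*`, `+`, `?`) are all assumptions made for illustration. The sketch keeps a set of active threads, advances all of them on each input character, and carries them across chunk boundaries so the input never has to be held in memory as a whole. It reports every (pattern index, start, end) triple and makes no attempt at match-priority rules or at the optimizations the abstract mentions.

```python
from typing import Iterable, Iterator, List, Set, Tuple

# Instruction kinds for a tiny NFA "program" (hypothetical names for this sketch).
CHAR, ANY, SPLIT, JMP, MATCH = range(5)
Instr = Tuple[int, object, object]


def compile_patterns(patterns: List[str]) -> Tuple[List[Instr], List[int]]:
    """Compile all patterns into one shared program; return (program, entry points).

    Supported syntax (an assumption of this sketch): literal characters,
    '.' for any character, and the postfix operators '*', '+' and '?'.
    """
    prog: List[Instr] = []
    entries: List[int] = []
    for pid, pat in enumerate(patterns):
        entries.append(len(prog))
        i = 0
        while i < len(pat):
            atom: Instr = (ANY, None, None) if pat[i] == "." else (CHAR, pat[i], None)
            op = pat[i + 1] if i + 1 < len(pat) and pat[i + 1] in "*+?" else ""
            here = len(prog)
            if op == "*":    # here: SPLIT -> atom or exit; atom; JMP back to SPLIT
                prog += [(SPLIT, here + 1, here + 3), atom, (JMP, here, None)]
            elif op == "+":  # atom; SPLIT -> repeat atom or exit
                prog += [atom, (SPLIT, here, here + 2)]
            elif op == "?":  # SPLIT -> atom or skip it
                prog += [(SPLIT, here + 1, here + 2), atom]
            else:
                prog.append(atom)
            i += 2 if op else 1
        prog.append((MATCH, pid, None))
    return prog, entries


def search(patterns: List[str], chunks: Iterable[str]) -> Iterator[Tuple[int, int, int]]:
    """Yield (pattern index, start, end) for every nonempty match in the stream."""
    prog, entries = compile_patterns(patterns)

    def add(threads: Set[Tuple[int, int]], pc: int, start: int, pos: int,
            hits: List[Tuple[int, int, int]]) -> None:
        # Follow SPLIT/JMP edges (the epsilon closure) and record any MATCH reached.
        stack = [pc]
        while stack:
            pc = stack.pop()
            if (pc, start) in threads:
                continue
            threads.add((pc, start))
            kind, a, b = prog[pc]
            if kind == SPLIT:
                stack += [a, b]
            elif kind == JMP:
                stack.append(a)
            elif kind == MATCH and pos > start:  # skip zero-length matches
                hits.append((a, start, pos))

    threads: Set[Tuple[int, int]] = set()  # survives across chunk boundaries
    pos = 0
    for chunk in chunks:
        for ch in chunk:
            hits: List[Tuple[int, int, int]] = []
            for entry in entries:  # unanchored search: start every pattern here
                add(threads, entry, pos, pos, hits)
            nxt: Set[Tuple[int, int]] = set()
            for pc, start in threads:
                kind, a, _ = prog[pc]
                if kind == ANY or (kind == CHAR and a == ch):
                    add(nxt, pc + 1, start, pos + 1, hits)
            threads = nxt
            pos += 1
            yield from sorted(hits)
```

For example, `list(search(["ab+c", "b.d", "cat"], ["xabbc", "atbcd"]))` yields `(0, 1, 5)`, `(2, 4, 7)` and `(1, 7, 10)`; the second hit spans the boundary between the two chunks. Tracking a start offset per thread keeps the sketch short but lets the thread set grow with the input, which a practical tool would need to bound.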

Similar resources

Multiple Pattern Matching Algorithms on Collage System

Compressed pattern matching is one of the most active topics in string matching. The goal is to find all occurrences of a pattern in a compressed text without decompression. Various algorithms have been proposed depending on underlying compression methods in the last decade. Although some algorithms for multipattern searching on compressed text were also presented very recently, all of them are...

Fine Classification & Recognition of Hand Written Devnagari Characters with Regular Expressions & Minimum Edit Distance Method

Regular expressions are extremely useful, because they allow us to work with text in terms of patterns. They are considered the most sophisticated means of performing operations such as string searching, manipulation, validation, and formatting in all applications that deal with text data. Character recognition problem scenarios in sequence analysis that are ideally suited for the application o...

A Dynamically Reconfigurable FPGA-Based Pattern Matching Hardware for Subclasses of Regular Expressions

In this paper, we propose a novel architecture for large-scale regular expression matching, called dynamically reconfigurable bit-parallel NFA architecture (Dynamic BP-NFA), which allows dynamic loading of regular expressions on-the-fly as well as efficient pattern matching for fast data streams. This is the first dynamically reconfigurable hardware with guaranteed performance for the class of ex...

Top-k Pattern Matching Using an Information-Theoretic Criterion over Probabilistic Data Streams

With the development of data mining technologies for sensor data streams, more sophisticated methods for complex event processing are in demand. In the case of event recognition, since event recognition results may contain errors, we need to deal with the uncertainty of events. We therefore consider probabilistic event data streams with occurrence probabilities of events, and develop a pattern mat...

String Pattern Matching For a Deluge Survival

String Pattern Matching concerns itself with algorithmic and combinatorial issues related to matching and searching on linearly arranged sequences of symbols, arguably the simplest possible discrete structures. As unprecedented volumes of sequence data are amassed, disseminated and shared at an increasing pace, effective access to, and manipulation of such data depend crucially on the efficiency ...

Publication date: 2011